For my final project I chose to compete in a Kaggle competition.
The competition is hosted by H&M and the goal is to predict what shoppers will buy in the next 7 days.
The competition details are here -> https://www.kaggle.com/c/h-and-m-personalized-fashion-recommendations
The data comprises four files:
The size of the data will be shown in the EDA part one section of the report.
My strategy is to first clean the data so that I can feed multiple weeks of shopping data into the model. The grain of the data will be the customer ID, which means I will need to change the shape of the data. By encoding features and rejoining them to the dataset I should be able to create the features. The dataset is extremely large, so in order to run the model I will need to reduce its size by grouping customers. I am using age to perform this grouping.
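The reshape to customer grain can be sketched on toy data (hypothetical IDs, not the competition files): each customer's purchases are pivoted so one row per customer holds the articles bought in each week.

```python
import pandas as pd

# Toy transactions (hypothetical IDs): one row per purchase
tx = pd.DataFrame({
    'customer_id': ['c1', 'c1', 'c2'],
    'bucket':      [0, 1, 0],            # week number
    'article_id':  ['a10', 'a11', 'a20'],
})
# Reshape to customer grain: one row per customer, one column per week
wide = tx.pivot_table(index='customer_id', columns='bucket',
                      values='article_id', aggfunc=lambda s: ' '.join(s))
print(wide)
```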
My strategy for making a prediction is to use two models. As you will see from the EDA, most customers do not buy something every week. The first model makes a binary prediction: buy = 1, no buy = 0. For the customers that I believe will buy, I use K-nearest neighbors to identify what they will buy.
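A minimal sketch of this two-stage idea on synthetic data (hypothetical features and labels, with LogisticRegression standing in for the binary model; this is not the notebook's actual pipeline):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))                 # synthetic customer features
will_buy = (X[:, 0] > 0).astype(int)          # stage-1 label: buys this week?
item = rng.integers(0, 3, size=200)           # stage-2 label: which item

# Stage 1: binary classifier predicts whether a customer buys at all
stage1 = LogisticRegression().fit(X, will_buy)
# Stage 2: KNN trained only on the customers who actually bought
stage2 = KNeighborsClassifier(n_neighbors=5).fit(X[will_buy == 1], item[will_buy == 1])

X_new = rng.normal(size=(10, 4))
buy_pred = stage1.predict(X_new)
items = np.full(len(X_new), -1)               # -1 marks "no purchase predicted"
if buy_pred.any():
    items[buy_pred == 1] = stage2.predict(X_new[buy_pred == 1])
print(items)
```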
I am using scikit-learn for the KNN and LightGBM for the binary classification. I am using Optuna to optimize hyperparameters for both.
The model learns and does a good job predicting the next week's buying pattern; however, the dataset size presented many challenges. In order to upload a complete submission file, this project will need to be rewritten as a Python script that allows better segmentation of the data.
The EDA done on this data and the feature and model design performed in this notebook will allow a smooth transition from data/feature/model discovery to productionization and submission to Kaggle.
I learned a few new skills from this project. I was exposed to new features of Optuna, and it was the first time I used pandas_profiling, which I highly recommend and will continue to use.
The GitLab repo for this project is ->
In that repo I have stored a PDF of the completed markdown file in case you are unable to run the notebook. I have also stored the graphical outputs in the folders eda_analysis and optimization_graphs within the repo.
#from IPython.core.display import display, HTML
#display(HTML("<style>.container { width:100% !important; }</style>"))
import pandas as pd
import numpy as np
import sklearn as sk
import sklearn.metrics  # ensure sk.metrics.f1_score is available below
from datetime import timedelta
from sklearn.preprocessing import LabelEncoder
from sklearn.impute import KNNImputer
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
import scipy as sp
import math
import statistics
from pandas_profiling import ProfileReport
import lightgbm as lgb
import optuna as op
from optuna.visualization import plot_optimization_history
from optuna.visualization import plot_intermediate_values
from optuna.visualization import plot_parallel_coordinate
from optuna.visualization import plot_contour
from optuna.visualization import plot_slice
from optuna.visualization import plot_param_importances
import pickle
import plotly
import nbconvert
The two code snippets below show how I am ingesting the data. The dataset is extremely large, so I needed to be creative with how it was ingested. I used the date field as a way to pare down the data while developing. I will then scale back up once ready to make a final prediction.
trainingdata = pd.read_csv('data/transactions_train.csv')
articlesdata = pd.read_csv('data/articles.csv')
customersdata = pd.read_csv('data/customers.csv')
trainingdata['t_dat'] = pd.to_datetime(trainingdata['t_dat'], yearfirst=True)
print(trainingdata['t_dat'].max())
# Development window: only min_date is used for filtering below
max_date = trainingdata['t_dat'].min() + timedelta(days=140)
min_date = trainingdata['t_dat'].max() - timedelta(days=70)
print(min_date)
print(max_date)
# Keep only the most recent 70 days to pare the data down while developing
trainingdata = trainingdata.loc[trainingdata['t_dat'] > min_date]
# Day offset from the start of the window, then 7-day buckets (weeks)
trainingdata['days_from_start'] = (trainingdata['t_dat'] - trainingdata['t_dat'].min()).dt.days.astype('int16')
trainingdata['bucket'] = (trainingdata['days_from_start'] / 7).apply(np.floor)
2020-09-22 00:00:00
2020-07-14 00:00:00
2019-02-07 00:00:00
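As a quick sanity check of the weekly bucketing above, day offsets map to buckets like this (toy values, not the competition data):

```python
import numpy as np
import pandas as pd

# Days 0-6 land in bucket 0, days 7-13 in bucket 1, and so on
days = pd.Series([0, 3, 6, 7, 13, 14], dtype='int16')
buckets = (days / 7).apply(np.floor)
print(buckets.tolist())  # [0.0, 0.0, 0.0, 1.0, 1.0, 2.0]
```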
The data comes in three separate files. In order to create features to train a model on and perform EDA, I need to join the data together. At this point I have a lot of data held in RAM, so I start deleting unused data structures.
#merged_set = trainingdata.merge(articlesdata, left_on='article_id', right_on='article_id',
#                                suffixes=('_left', '_right'))
merged_set = trainingdata.merge(customersdata, left_on='customer_id', right_on='customer_id',
                                suffixes=('_left', '_right'))
del trainingdata
The first round of exploration comes after the data is joined together and in raw form. At this point I am looking for data points to remove because they are not complete enough to provide substance to the model. I will look at correlation further on, when the data is in the format that will feed into the model.
The output is saved as an html file in the folder eda_analysis. An example has been uploaded to gitlab.
profile = ProfileReport(merged_set, title="Pandas Profiling Report", explorative=True)
profile.to_notebook_iframe()
profile.to_file("eda_analysis/eda_part_1.html")
From the first round of EDA I see that there are multiple features with many nulls to be removed. These fields are FN and Active. The rest of the string fields need to be encoded, keyed by customer ID, to fit within the data model.
x = merged_set.drop(columns=['t_dat','article_id','price','sales_channel_id','days_from_start','bucket','FN','Active'])
x.drop_duplicates(inplace=True)
x.reset_index(inplace=True)
x.head()
| | index | customer_id | club_member_status | fashion_news_frequency | age | postal_code |
|---|---|---|---|---|---|---|
| 0 | 0 | 0001d44dbe7f6c4b35200abdb052c77a87596fe1bdcc37... | ACTIVE | Regularly | 44.0 | 930b19ae7db8abb5a27f4da10217755a7305b4c452f5e0... |
| 1 | 66 | 0008d644deb96bdc0ca262f161cf6d5e9a4e619bb75faa... | ACTIVE | NONE | 32.0 | 7d1a68652e11ef1653149e002e3508dabfe095f4cd93c4... |
| 2 | 70 | 000da6daeb90ef9a70238bf9b1aa54c7ce40a5e0fcf220... | ACTIVE | Regularly | 42.0 | a67d8a85fb9cd7728a4a7a02244328b27bdb04e93a7b21... |
| 3 | 82 | 000fb6e772c5d0023892065e659963da90b1866035558e... | ACTIVE | Regularly | 42.0 | 68ca4d9d6051d9c10b917d36bf9cb4afbadc551f7e4feb... |
| 4 | 140 | 00194061f3caa80bf10d615bf406bc5959a3bd799e4f21... | ACTIVE | NONE | 24.0 | b8d066ae8df3d645c85a5ba869e2fa7b528cf27d92dc06... |
#pivoted_merged_data = merged_set.pivot_table(index='customer_id', columns='t_dat', values='article_id', aggfunc=lambda x: ' '.join(x))
pivoted_merged_data = merged_set.pivot_table(index=['customer_id'],
                                             columns=['bucket'],
                                             values=['article_id'],
                                             aggfunc=lambda x: ' '.join(str(v) for v in x))
pivoted_merged_data.head()
(values are article_id, columns are the weekly buckets)

| customer_id | 0.0 | 1.0 | 2.0 | 3.0 | 4.0 | 5.0 | 6.0 | 7.0 | 8.0 | 9.0 |
|---|---|---|---|---|---|---|---|---|---|---|
| 00000dbacae5abe5e23885899a1fa44253a17956c6d1c3d25f88aa139fdfc657 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 568601043 | NaN | NaN |
| 000058a12d5b43e67d225668fa1f8d618c13dc232df0cad8ffe7ad4a1091e318 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 794321007 | NaN |
| 00006413d8573cd20ed7128e53b7b13819fe5cfc2d801fe7fc0f26dd8d65a85a | NaN | NaN | NaN | NaN | 896152002 730683050 927530004 791587015 | NaN | NaN | NaN | NaN | NaN |
| 0000757967448a6cb83efb3ea7a3fb9d418ac7adf2379d8cd0c725276a467a2a | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 719530003 448509014 | NaN |
| 0000945f66de1a11d9447609b8b41b1bc987ba185a5496ae8831e8493afa24ff | 760084003 760084013 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
training_data_set = x.join(pivoted_merged_data,on=['customer_id'], how='inner')
training_data_set.drop(columns=['index'], inplace = True)
training_data_set = training_data_set.loc[training_data_set['age'] < 30]
training_data_set.head()
| | customer_id | club_member_status | fashion_news_frequency | age | postal_code | (article_id, 0.0) | (article_id, 1.0) | (article_id, 2.0) | (article_id, 3.0) | (article_id, 4.0) | (article_id, 5.0) | (article_id, 6.0) | (article_id, 7.0) | (article_id, 8.0) | (article_id, 9.0) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4 | 00194061f3caa80bf10d615bf406bc5959a3bd799e4f21... | ACTIVE | NONE | 24.0 | b8d066ae8df3d645c85a5ba869e2fa7b528cf27d92dc06... | 796535007 759482003 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 572998005 | NaN |
| 5 | 001cd4541544c87a2c9fb19bad430646eadc24226d8358... | ACTIVE | NONE | 26.0 | 27c895023a1581925eabd7adbb36810765c018fa818651... | 897693002 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 6 | 001ea4e9c54f7e9c88811260d954edc059d596147e1cf8... | ACTIVE | NONE | 25.0 | 63c82aec4da25b6b5f967abe4e17d112b24e53643e916d... | 887770004 878941004 | NaN | NaN | NaN | 882899012 860949002 739590032 | NaN | NaN | NaN | NaN | 863646005 |
| 7 | 001eacd7b51c28d306e48b93b28d5f209a1da073bb94c7... | ACTIVE | NONE | 27.0 | e1474f9a871813651183c7fc5555a9d5d83106a5818528... | 859105008 | 886241003 852746001 817472007 914966002 900659... | 902694001 881933001 | 837741001 850259002 | NaN | NaN | NaN | NaN | NaN | NaN |
| 11 | 002efd1ae90ecbd94196276378da2ac5ff49fe79ee13c2... | ACTIVE | Regularly | 21.0 | 7e42dc67e7fea94838531e8d556c0c21ba19365b61ae61... | 833530005 806241007 833499005 833499005 941005... | NaN | NaN | NaN | NaN | 856270003 856270003 918292011 918292011 827968... | NaN | 935689001 935689001 863583002 935092001 920084... | NaN | NaN |
In this part I am imputing any NaN data points with 'No' if the column holds strings, and with the most common value if it holds numbers. The string columns are then encoded so that they work better in the algorithms.
#del x
#del pivoted_merged_data
#training_data_set.rename(columns={"('article_id', 0.0)": "0", "('article_id', 1.0)": "1"}, errors="raise", inplace = True)
#training_data_set["('article_id', 0.0)"].head()
# Rename the pivoted article_id columns to 0_week ... 9_week
i = 0
for each in range(5, len(training_data_set.columns)):
    training_data_set.columns.values[each] = f'{i}_week'
    i += 1
# Impute the demographic columns: 'No' for strings, the mode for numbers
for each in range(0, 5):
    column_name = training_data_set.columns.tolist()[each]
    if training_data_set[column_name].dtypes == 'object':
        training_data_set[column_name] = training_data_set[column_name].replace(np.nan, 'No')
    else:
        training_data_set[column_name] = training_data_set[column_name].replace(np.nan, statistics.mode(training_data_set[column_name]))
# Mark weeks with no purchase with the placeholder '1'
for each in range(5, len(training_data_set.columns)):
    column_name = training_data_set.columns.tolist()[each]
    training_data_set[column_name] = training_data_set[column_name].replace(np.nan, '1')
training_data_set = training_data_set.iloc[:500000].sort_values(by=['age'])
training_data_set.head()
| | customer_id | club_member_status | fashion_news_frequency | age | postal_code | 0_week | 1_week | 2_week | 3_week | 4_week | 5_week | 6_week | 7_week | 8_week | 9_week |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 294244 | 588e32d81a3c5eda8439b88654cb2f7acb336c9e028ca7... | ACTIVE | NONE | 16.0 | c49603b16392d7701f70a9eee862d321eb171d3a18e4d8... | 1 | 1 | 1 | 1 | 1 | 924142001 | 1 | 1 | 1 | 1 |
| 260706 | 5374130b2d260391c22de34dd9237c9c5bcc320b1fec31... | ACTIVE | NONE | 16.0 | 9f083c65954fb1ceb1cfd74d2e6dba70428d9005743084... | 1 | 1 | 1 | 1 | 806388028 916468003 806388018 | 1 | 1 | 1 | 806388002 | 1 |
| 365141 | 7beee2baccfda3501868a56642b738bc52bcf1804d4612... | ACTIVE | Regularly | 16.0 | 3f2c72cff97fb4cd9e85a5f351f4fc4dc2206f99aad8f8... | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 456163085 926502001 809961002 | 1 | 1 |
| 223433 | f0d47c078e1ae6a6c3552a8bd7c7c4226db650136b487a... | ACTIVE | NONE | 16.0 | b8572c20d2610a46bbdd304058a19a7dec301a892424e2... | 1 | 1 | 1 | 850426003 855793002 | 1 | 1 | 1 | 801673001 904424001 911214001 | 1 | 1 |
| 199856 | fd90cc5dabce374693d54684cc8dba552f2a40c9a2619f... | ACTIVE | NONE | 16.0 | f212e80f468490c6f135da6c0910d01e19a420d7deb4c8... | 1 | 1 | 910199002 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
customer_id_encoding = LabelEncoder()
training_data_set['customer_ids'] = customer_id_encoding.fit_transform(training_data_set['customer_id'])
club_member_status_encoding = LabelEncoder()
training_data_set['club_member_statuses'] = club_member_status_encoding.fit_transform(training_data_set['club_member_status'])
fashion_news_frequency_encoding = LabelEncoder()
training_data_set['fashion_news_frequencys'] = fashion_news_frequency_encoding.fit_transform(training_data_set['fashion_news_frequency'])
postal_code_encoding = LabelEncoder()
training_data_set['postal_codes'] = postal_code_encoding.fit_transform(training_data_set['postal_code'])
X = training_data_set.drop(columns=['customer_id','club_member_status','fashion_news_frequency','postal_code'])
column_to_move = X.pop("customer_ids")
X.insert(0, "customer_id", column_to_move)
column_to_move = X.pop("club_member_statuses")
X.insert(0, "club_member_status", column_to_move)
column_to_move = X.pop("fashion_news_frequencys")
X.insert(0, "fashion_news_frequency", column_to_move)
column_to_move = X.pop("postal_codes")
X.insert(0, "postal_codes", column_to_move)
The second round of exploration comes after the data has been pivoted and is being prepared for training. At this stage I am looking at the label, which is the last week of data. I see that most of the data points, including the label, have high cardinality, meaning that some algorithms won't work to predict it.
profile = ProfileReport(X, title="Pandas Profiling Report", explorative=True)
profile.to_notebook_iframe()
profile.to_file("eda_analysis/eda_part_2.html")
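The high-cardinality observation can be spot-checked without a full profiling run; a minimal sketch on toy labels (not the real column):

```python
import pandas as pd

# Toy label column: almost every value is distinct, i.e. high cardinality
labels = pd.Series(['a1 b2', 'c3', 'a1 b2', 'd4 e5', 'f6'])
ratio = labels.nunique() / len(labels)
print(ratio)  # 0.8 -> 4 unique values across 5 rows
```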
The model uses two different algorithms: a KNN and a gradient boosted machine package called LightGBM. Further, it uses Optuna to optimize the algorithms. Because the data is extremely large I partition it by age. From inspection I can see that most of the time the label is null, meaning that the customer didn't buy anything. I use LightGBM as a binary classifier to predict whether the consumer will buy anything, and KNN to determine what they will buy based on the closest neighbors. The model returns the optimized parameters for each age segment.
I use the F1 Score as my metric to optimize because of the large presence of nulls in the label. A pure accuracy score may lead the model to predict only null and show good accuracy.
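To illustrate the point on a toy label vector (synthetic values, not the competition data):

```python
from sklearn.metrics import accuracy_score, f1_score

# Why macro F1 rather than raw accuracy: with 90% "no buy" labels, a model
# that always predicts "no buy" looks accurate while learning nothing.
y_true = [0] * 90 + [1] * 10   # 0 = no buy, 1 = buy
y_pred = [0] * 100             # degenerate model: always predicts "no buy"
print(accuracy_score(y_true, y_pred))                               # 0.9
print(f1_score(y_true, y_pred, average='macro', zero_division=0))   # ~0.474
```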
class neighborObjective(object):
    def __init__(self, X):
        # Collect every article label across the week columns so the
        # encoder sees the full vocabulary
        labels = []
        X = X.sample(5000)
        for each in range(5, len(X.columns)):
            labels.extend(X.iloc[:, each].tolist())
        labels = list(set(labels))
        encoder = LabelEncoder()
        fit = encoder.fit(labels)
        # '1' is the placeholder for "no purchase"; remember its encoding
        self.null = fit.transform(['1'])
        i = 0
        for each in range(5, len(X.columns)):
            X.loc[:, f'{i}_weeks'] = fit.transform(X[f'{i}_week'])
            X = X.drop(columns=[f'{i}_week'])
            i += 1
        # The last column (the most recent week) is the label
        X_raw = X
        N = len(X_raw.columns)
        last_column = N - 1
        second_to_last_column = N - 2
        X = X_raw.iloc[:, 0:second_to_last_column]
        y = X_raw.iloc[:, last_column]
        self.X_train, self.X_test, self.y_train, self.y_test = train_test_split(X, y, test_size=0.20)
    def knn(self, trial):
        n = trial.suggest_int("neighbors", 2, 20, log=True)
        neigh = KNeighborsClassifier(n_neighbors=n)
        neigh.fit(self.X_train, self.y_train)
        neighbors = pd.DataFrame(neigh.kneighbors(return_distance=False))
        prediction = pd.DataFrame(neigh.predict(self.X_test))
        return neighbors, prediction
    def lightgbm(self, trial, neighbors, prediction):
        # Join the KNN neighbor indices and predictions as extra features
        X_train = self.X_train.join(neighbors).join(prediction.add_suffix('_pred'))
        X_test = self.X_test.join(neighbors).join(prediction.add_suffix('_pred'))
        # Work on copies so the stored labels are not mutated between trials
        y_train = self.y_train.copy()
        y_test = self.y_test.copy()
        # Collapse to binary: 1 where the label is the encoded "no purchase"
        # placeholder, -1 where the customer bought something
        y_train[y_train == self.null[0]] = 1
        y_train[y_train != 1] = -1
        y_test[y_test == self.null[0]] = 1
        y_test[y_test != 1] = -1
        dtrain = lgb.Dataset(X_train, label=y_train)
        dvalid = lgb.Dataset(X_test, label=y_test)
        param = {
            "objective": "binary",
            "metric": "binary_logloss",
            "verbosity": 0,
            "boosting_type": "gbdt",
            "num_leaves": trial.suggest_int("num_leaves", 2, 256),
            "feature_fraction": trial.suggest_float("feature_fraction", 0.4, 1.0),
            "bagging_fraction": trial.suggest_float("bagging_fraction", 0.4, 1.0),
            "bagging_freq": trial.suggest_int("bagging_freq", 1, 7),
            "min_child_samples": trial.suggest_int("min_child_samples", 5, 100),
            "early_stopping_round": 1
        }
        gbm = lgb.train(param, dtrain, valid_sets=dvalid)
        preds = gbm.predict(X_test)
        pred_labels = np.rint(preds)
        accuracy = sk.metrics.f1_score(y_test, pred_labels, average='macro')
        return accuracy
    def __call__(self, trial):
        neighbors, prediction = self.knn(trial)
        accuracy = self.lightgbm(trial, neighbors, prediction)
        return accuracy
if __name__ == "__main__":
    print(np.unique(X['age'], return_counts=True))
    best_models = []
    age_iteration = 5
    age_range = range(int(min(X['age'])), int(max(X['age'])), age_iteration)
    plots = []
    for young_age in age_range:
        print(young_age)
        old_age = young_age + age_iteration
        print(old_age)
        age_study = X.loc[(X['age'] >= young_age) & (X['age'] <= old_age)]
        print(len(age_study))
        age_study = age_study.iloc[:10000]
        study = op.create_study(direction="maximize")
        # optimize() returns None; take the best score from the study object
        study.optimize(neighborObjective(age_study), n_trials=20)
        accuracy = study.best_value
        fig = plot_optimization_history(study)
        fig.write_html(f"optimization_graphs/{young_age}_optimization_history.html")
        plots.append(fig)
        fig = plot_parallel_coordinate(study)
        fig.write_html(f"optimization_graphs/{young_age}_parallel_coordinate.html")
        plots.append(fig)
        fig = plot_contour(study)
        fig.write_html(f"optimization_graphs/{young_age}_contour.html")
        plots.append(fig)
        fig = plot_slice(study)
        fig.write_html(f"optimization_graphs/{young_age}_plot_slice.html")
        plots.append(fig)
        #plot_param_importances(study)
        best_models.append([study.best_params, young_age, age_iteration, accuracy])
        print(study.best_trial)
[I 2022-03-01 17:31:18,855] A new study created in memory with name: no-name-c1013a18-b801-4825-8a8d-e0dac321cfd4
(array([16., 17., 18., 19., 20., 21., 22., 23., 24., 25., 26., 27., 28.,
29.]), array([ 65, 3942, 7539, 12494, 18702, 22824, 18630, 20515, 21089,
20493, 19967, 17602, 15538, 14031], dtype=int64))
16
21
65566
[I 2022-03-01 17:31:19,373] Trial 0 finished with value: 0.4617868675995694 and parameters: {'neighbors': 3, 'num_leaves': 173, 'feature_fraction': 0.5186323020532635, 'bagging_fraction': 0.6163552066220521, 'bagging_freq': 6, 'min_child_samples': 31}. Best is trial 0 with value: 0.4617868675995694.
[I 2022-03-01 17:31:19,538] Trial 1 finished with value: 0.30785907859078593 and parameters: {'neighbors': 7, 'num_leaves': 107, 'feature_fraction': 0.7137972448961027, 'bagging_fraction': 0.7648585770407605, 'bagging_freq': 2, 'min_child_samples': 70}. Best is trial 0 with value: 0.4617868675995694.
[I 2022-03-01 17:31:19,672] Trial 2 finished with value: 0.4617868675995694 and parameters: {'neighbors': 7, 'num_leaves': 185, 'feature_fraction': 0.4403344560293559, 'bagging_fraction': 0.6689355433676443, 'bagging_freq': 2, 'min_child_samples': 46}. Best is trial 0 with value: 0.4617868675995694.
[I 2022-03-01 17:31:19,820] Trial 3 finished with value: 0.4617868675995694 and parameters: {'neighbors': 12, 'num_leaves': 20, 'feature_fraction': 0.7856558769631206, 'bagging_fraction': 0.4258870980509263, 'bagging_freq': 2, 'min_child_samples': 43}. Best is trial 0 with value: 0.4617868675995694.
[I 2022-03-01 17:31:19,997] Trial 4 finished with value: 0.4617868675995694 and parameters: {'neighbors': 8, 'num_leaves': 256, 'feature_fraction': 0.5151810779827115, 'bagging_fraction': 0.8225336932989005, 'bagging_freq': 2, 'min_child_samples': 25}. Best is trial 0 with value: 0.4617868675995694.
[I 2022-03-01 17:31:20,159] Trial 5 finished with value: 0.4617868675995694 and parameters: {'neighbors': 16, 'num_leaves': 40, 'feature_fraction': 0.45894312808773186, 'bagging_fraction': 0.477543533951297, 'bagging_freq': 3, 'min_child_samples': 73}. Best is trial 0 with value: 0.4617868675995694.
[I 2022-03-01 17:31:20,320] Trial 6 finished with value: 0.3079964061096137 and parameters: {'neighbors': 14, 'num_leaves': 29, 'feature_fraction': 0.631352876269911, 'bagging_fraction': 0.9491682564465063, 'bagging_freq': 5, 'min_child_samples': 40}. Best is trial 0 with value: 0.4617868675995694.
[LightGBM] [Warning] Auto-choosing row-wise multi-threading, the overhead of testing was 0.000366 seconds. You can set `force_row_wise=true` to remove the overhead. And if memory is not enough, you can set `force_col_wise=true`. [1] valid_0's binary_logloss: 0.403257 Training until validation scores don't improve for 1 rounds [2] valid_0's binary_logloss: 0.397419 [3] valid_0's binary_logloss: 0.393456 [4] valid_0's binary_logloss: 0.387846 [5] valid_0's binary_logloss: 0.382666 [6] valid_0's binary_logloss: 0.377799 [7] valid_0's binary_logloss: 0.376139 [8] valid_0's binary_logloss: 0.370159 [9] valid_0's binary_logloss: 0.366341 [10] valid_0's binary_logloss: 0.364704 [11] valid_0's binary_logloss: 0.3629 [12] valid_0's binary_logloss: 0.35987 [13] valid_0's binary_logloss: 0.356999 [14] valid_0's binary_logloss: 0.35591 [15] valid_0's binary_logloss: 0.355181 [16] valid_0's binary_logloss: 0.352903 [17] valid_0's binary_logloss: 0.352577 [18] valid_0's binary_logloss: 0.351336 [19] valid_0's binary_logloss: 0.349148 [20] valid_0's binary_logloss: 0.348329 [21] valid_0's binary_logloss: 0.345031 [22] valid_0's binary_logloss: 0.342359 [23] valid_0's binary_logloss: 0.342445 Early stopping, best iteration is: [22] valid_0's binary_logloss: 0.342359 [LightGBM] [Warning] Auto-choosing row-wise multi-threading, the overhead of testing was 0.000367 seconds. You can set `force_row_wise=true` to remove the overhead. And if memory is not enough, you can set `force_col_wise=true`. 
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf [1] valid_0's binary_logloss: 0.403671 Training until validation scores don't improve for 1 rounds [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [2] valid_0's binary_logloss: 0.398093 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [3] valid_0's binary_logloss: 0.394065 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [4] valid_0's binary_logloss: 0.388394 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [5] valid_0's binary_logloss: 0.383861 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [6] valid_0's binary_logloss: 0.379026 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [7] valid_0's binary_logloss: 0.377584 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [8] valid_0's binary_logloss: 0.372334 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [9] valid_0's binary_logloss: 0.36922 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [10] valid_0's binary_logloss: 0.36761 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [11] valid_0's binary_logloss: 0.366747 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [12] valid_0's binary_logloss: 0.364113 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [13] valid_0's binary_logloss: 0.361431 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[I 2022-03-01 17:31:20,493] Trial 7 finished with value: 0.30780372985696175 and parameters: {'neighbors': 13, 'num_leaves': 123, 'feature_fraction': 0.6295645079385951, 'bagging_fraction': 0.7026287335147489, 'bagging_freq': 7, 'min_child_samples': 64}. Best is trial 0 with value: 0.4617868675995694.
[14] valid_0's binary_logloss: 0.36009 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [15] valid_0's binary_logloss: 0.358872 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [16] valid_0's binary_logloss: 0.35816 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [17] valid_0's binary_logloss: 0.358126 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [18] valid_0's binary_logloss: 0.357733 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [19] valid_0's binary_logloss: 0.356696 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [20] valid_0's binary_logloss: 0.355229 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [21] valid_0's binary_logloss: 0.352632 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [22] valid_0's binary_logloss: 0.3488 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [23] valid_0's binary_logloss: 0.348236 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [24] valid_0's binary_logloss: 0.347215 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [25] valid_0's binary_logloss: 0.34597 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [26] valid_0's binary_logloss: 0.34508 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [27] valid_0's binary_logloss: 0.344605 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [28] valid_0's binary_logloss: 0.344235 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [29] valid_0's binary_logloss: 0.343657 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [30] valid_0's binary_logloss: 0.342319 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [31] valid_0's binary_logloss: 0.342839 Early 
stopping, best iteration is: [30] valid_0's binary_logloss: 0.342319 [LightGBM] [Warning] Auto-choosing row-wise multi-threading, the overhead of testing was 0.000489 seconds. You can set `force_row_wise=true` to remove the overhead. And if memory is not enough, you can set `force_col_wise=true`. [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [1] valid_0's binary_logloss: 0.405618 Training until validation scores don't improve for 1 rounds [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [2] valid_0's binary_logloss: 0.401484 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [3] valid_0's binary_logloss: 0.397467 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [4] valid_0's binary_logloss: 0.392995 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [5] valid_0's binary_logloss: 0.389528 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [6] valid_0's binary_logloss: 0.385913 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [7] valid_0's binary_logloss: 0.383321 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [8] valid_0's binary_logloss: 0.378355 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [9] valid_0's binary_logloss: 0.376255 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [10] valid_0's binary_logloss: 0.37492 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [11] valid_0's binary_logloss: 0.373454 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [12] valid_0's binary_logloss: 0.371351 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [13] valid_0's binary_logloss: 0.368583 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [14] valid_0's binary_logloss: 0.367198 [LightGBM] [Warning] No further splits 
with positive gain, best gain: -inf [15] valid_0's binary_logloss: 0.36676 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [16] valid_0's binary_logloss: 0.365275 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [17] valid_0's binary_logloss: 0.364619 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [18] valid_0's binary_logloss: 0.363271 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [19] valid_0's binary_logloss: 0.361503 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [20] valid_0's binary_logloss: 0.360519 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [21] valid_0's binary_logloss: 0.357582 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [22] valid_0's binary_logloss: 0.354303 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [23] valid_0's binary_logloss: 0.354089 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[I 2022-03-01 17:31:20,659] Trial 8 finished with value: 0.4617868675995694 and parameters: {'neighbors': 16, 'num_leaves': 221, 'feature_fraction': 0.6185583357953942, 'bagging_fraction': 0.9474951707530209, 'bagging_freq': 6, 'min_child_samples': 86}. Best is trial 0 with value: 0.4617868675995694. [I 2022-03-01 17:31:20,807] Trial 9 finished with value: 0.30813534917206625 and parameters: {'neighbors': 2, 'num_leaves': 92, 'feature_fraction': 0.8022359515059101, 'bagging_fraction': 0.6723649851322947, 'bagging_freq': 6, 'min_child_samples': 38}. Best is trial 0 with value: 0.4617868675995694.
[24] valid_0's binary_logloss: 0.354257 Early stopping, best iteration is: [23] valid_0's binary_logloss: 0.354089 [LightGBM] [Warning] Auto-choosing row-wise multi-threading, the overhead of testing was 0.000470 seconds. You can set `force_row_wise=true` to remove the overhead. And if memory is not enough, you can set `force_col_wise=true`. [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [1] valid_0's binary_logloss: 0.39687 Training until validation scores don't improve for 1 rounds [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [2] valid_0's binary_logloss: 0.388579 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [3] valid_0's binary_logloss: 0.379622 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [4] valid_0's binary_logloss: 0.373708 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [5] valid_0's binary_logloss: 0.368797 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [6] valid_0's binary_logloss: 0.361635 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [7] valid_0's binary_logloss: 0.36054 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [8] valid_0's binary_logloss: 0.353147 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [9] valid_0's binary_logloss: 0.349811 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [10] valid_0's binary_logloss: 0.347528 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [11] valid_0's binary_logloss: 0.346096 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [12] valid_0's binary_logloss: 0.3446 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [13] valid_0's binary_logloss: 0.341606 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [14] valid_0's binary_logloss: 
0.341071 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [15] valid_0's binary_logloss: 0.341371 Early stopping, best iteration is: [14] valid_0's binary_logloss: 0.341071
[I 2022-03-01 17:31:20,997] Trial 10 finished with value: 0.3084249084249084 and parameters: {'neighbors': 3, 'num_leaves': 168, 'feature_fraction': 0.9618698595941324, 'bagging_fraction': 0.5530801589065851, 'bagging_freq': 4, 'min_child_samples': 11}. Best is trial 0 with value: 0.4617868675995694.
[LightGBM] [Warning] Auto-choosing row-wise multi-threading, the overhead of testing was 0.000521 seconds. You can set `force_row_wise=true` to remove the overhead. And if memory is not enough, you can set `force_col_wise=true`. [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [1] valid_0's binary_logloss: 0.385993 Training until validation scores don't improve for 1 rounds [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [2] valid_0's binary_logloss: 0.371575 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [3] valid_0's binary_logloss: 0.361866 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [4] valid_0's binary_logloss: 0.354453 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [5] valid_0's binary_logloss: 0.34871 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [6] valid_0's binary_logloss: 0.344129 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [7] valid_0's binary_logloss: 0.343264 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [8] valid_0's binary_logloss: 0.341252 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [9] valid_0's binary_logloss: 0.339375 [10] valid_0's binary_logloss: 0.337994 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [11] valid_0's binary_logloss: 0.337136 [12] valid_0's binary_logloss: 0.33799 Early stopping, best iteration is: [11] valid_0's binary_logloss: 0.337136 [LightGBM] [Warning] Auto-choosing row-wise multi-threading, the overhead of testing was 0.000386 seconds. You can set `force_row_wise=true` to remove the overhead. And if memory is not enough, you can set `force_col_wise=true`.
[I 2022-03-01 17:31:21,178] Trial 11 finished with value: 0.4617868675995694 and parameters: {'neighbors': 4, 'num_leaves': 178, 'feature_fraction': 0.4038503326118307, 'bagging_fraction': 0.5821332452396735, 'bagging_freq': 4, 'min_child_samples': 16}. Best is trial 0 with value: 0.4617868675995694.
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf [1] valid_0's binary_logloss: 0.406036 Training until validation scores don't improve for 1 rounds [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [2] valid_0's binary_logloss: 0.4047 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [3] valid_0's binary_logloss: 0.404618 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [4] valid_0's binary_logloss: 0.404323 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [5] valid_0's binary_logloss: 0.400775 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [6] valid_0's binary_logloss: 0.399275 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [7] valid_0's binary_logloss: 0.396764 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [8] valid_0's binary_logloss: 0.396448 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [9] valid_0's binary_logloss: 0.393795 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [10] valid_0's binary_logloss: 0.392318 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [11] valid_0's binary_logloss: 0.388656 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [12] valid_0's binary_logloss: 0.386099 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [13] valid_0's binary_logloss: 0.384759 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [14] valid_0's binary_logloss: 0.38429 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [15] valid_0's binary_logloss: 0.384427 Early stopping, best iteration is: [14] valid_0's binary_logloss: 0.38429 [LightGBM] [Warning] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000533 seconds. 
You can set `force_col_wise=true` to remove the overhead. [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [1] valid_0's binary_logloss: 0.405646 Training until validation scores don't improve for 1 rounds [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [2] valid_0's binary_logloss: 0.404482 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [3] valid_0's binary_logloss: 0.400351 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [4] valid_0's binary_logloss: 0.39887 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [5] valid_0's binary_logloss: 0.395127 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [6] valid_0's binary_logloss: 0.393732
[I 2022-03-01 17:31:21,356] Trial 12 finished with value: 0.4617868675995694 and parameters: {'neighbors': 4, 'num_leaves': 175, 'feature_fraction': 0.5174736350011806, 'bagging_fraction': 0.615136568276763, 'bagging_freq': 1, 'min_child_samples': 55}. Best is trial 0 with value: 0.4617868675995694.
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf [7] valid_0's binary_logloss: 0.391999 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [8] valid_0's binary_logloss: 0.387435 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [9] valid_0's binary_logloss: 0.386645 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [10] valid_0's binary_logloss: 0.385379 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [11] valid_0's binary_logloss: 0.384033 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [12] valid_0's binary_logloss: 0.381939 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [13] valid_0's binary_logloss: 0.379549 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [14] valid_0's binary_logloss: 0.378338 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [15] valid_0's binary_logloss: 0.376152 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [16] valid_0's binary_logloss: 0.374137 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [17] valid_0's binary_logloss: 0.373656 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [18] valid_0's binary_logloss: 0.371941 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [19] valid_0's binary_logloss: 0.369109 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [20] valid_0's binary_logloss: 0.366802 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [21] valid_0's binary_logloss: 0.363861 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [22] valid_0's binary_logloss: 0.360001 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [23] valid_0's binary_logloss: 0.359186 [LightGBM] [Warning] No further splits with 
positive gain, best gain: -inf [24] valid_0's binary_logloss: 0.359181 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [25] valid_0's binary_logloss: 0.359369 Early stopping, best iteration is: [24] valid_0's binary_logloss: 0.359181 [LightGBM] [Warning] Auto-choosing row-wise multi-threading, the overhead of testing was 0.000349 seconds. You can set `force_row_wise=true` to remove the overhead. And if memory is not enough, you can set `force_col_wise=true`. [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [1] valid_0's binary_logloss: 0.406731 Training until validation scores don't improve for 1 rounds [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [2] valid_0's binary_logloss: 0.405136 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [3] valid_0's binary_logloss: 0.403369 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [4] valid_0's binary_logloss: 0.402415 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [5] valid_0's binary_logloss: 0.398924 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [6] valid_0's binary_logloss: 0.397789 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [7] valid_0's binary_logloss: 0.395483 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[I 2022-03-01 17:31:21,519] Trial 13 finished with value: 0.4617868675995694 and parameters: {'neighbors': 2, 'num_leaves': 213, 'feature_fraction': 0.4003981223688465, 'bagging_fraction': 0.8093876881958643, 'bagging_freq': 7, 'min_child_samples': 27}. Best is trial 0 with value: 0.4617868675995694. [I 2022-03-01 17:31:21,683] Trial 14 finished with value: 0.4617868675995694 and parameters: {'neighbors': 5, 'num_leaves': 152, 'feature_fraction': 0.49985388193331004, 'bagging_fraction': 0.5263257201224568, 'bagging_freq': 5, 'min_child_samples': 98}. Best is trial 0 with value: 0.4617868675995694.
[8] valid_0's binary_logloss: 0.395944 Early stopping, best iteration is: [7] valid_0's binary_logloss: 0.395483 [LightGBM] [Warning] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000562 seconds. You can set `force_col_wise=true` to remove the overhead. [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [1] valid_0's binary_logloss: 0.405792 Training until validation scores don't improve for 1 rounds [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [2] valid_0's binary_logloss: 0.403783 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [3] valid_0's binary_logloss: 0.399946 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [4] valid_0's binary_logloss: 0.397946 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [5] valid_0's binary_logloss: 0.395153 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [6] valid_0's binary_logloss: 0.393443 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [7] valid_0's binary_logloss: 0.39266 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [8] valid_0's binary_logloss: 0.388537 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [9] valid_0's binary_logloss: 0.386574 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [10] valid_0's binary_logloss: 0.385193 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [11] valid_0's binary_logloss: 0.383307 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [12] valid_0's binary_logloss: 0.383304 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [13] valid_0's binary_logloss: 0.380542 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [14] valid_0's binary_logloss: 0.379287 [LightGBM] [Warning] No further splits with positive gain, 
best gain: -inf [15] valid_0's binary_logloss: 0.377336 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [16] valid_0's binary_logloss: 0.376295 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [17] valid_0's binary_logloss: 0.375354 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [18] valid_0's binary_logloss: 0.374138 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [19] valid_0's binary_logloss: 0.372181 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [20] valid_0's binary_logloss: 0.370819 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [21] valid_0's binary_logloss: 0.367742 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [22] valid_0's binary_logloss: 0.364009 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [23] valid_0's binary_logloss: 0.361907 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [24] valid_0's binary_logloss: 0.361853 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [25] valid_0's binary_logloss: 0.36098 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [26] valid_0's binary_logloss: 0.359914 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [27] valid_0's binary_logloss: 0.359939 Early stopping, best iteration is: [26] valid_0's binary_logloss: 0.359914
[I 2022-03-01 17:31:21,861] Trial 15 finished with value: 0.4617868675995694 and parameters: {'neighbors': 3, 'num_leaves': 205, 'feature_fraction': 0.5720796118583584, 'bagging_fraction': 0.6637860065784509, 'bagging_freq': 1, 'min_child_samples': 52}. Best is trial 0 with value: 0.4617868675995694.
[LightGBM] [Warning] Auto-choosing row-wise multi-threading, the overhead of testing was 0.000380 seconds. You can set `force_row_wise=true` to remove the overhead. And if memory is not enough, you can set `force_col_wise=true`. [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [1] valid_0's binary_logloss: 0.405908 Training until validation scores don't improve for 1 rounds [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [2] valid_0's binary_logloss: 0.402199 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [3] valid_0's binary_logloss: 0.39832 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [4] valid_0's binary_logloss: 0.393493 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [5] valid_0's binary_logloss: 0.390196 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [6] valid_0's binary_logloss: 0.387483 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [7] valid_0's binary_logloss: 0.384793 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [8] valid_0's binary_logloss: 0.380165 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [9] valid_0's binary_logloss: 0.37841 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [10] valid_0's binary_logloss: 0.376854 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [11] valid_0's binary_logloss: 0.374688 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [12] valid_0's binary_logloss: 0.372791 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [13] valid_0's binary_logloss: 0.370168 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [14] valid_0's binary_logloss: 0.368817 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [15] valid_0's binary_logloss: 
0.368398 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [16] valid_0's binary_logloss: 0.367807 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [17] valid_0's binary_logloss: 0.367673 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [18] valid_0's binary_logloss: 0.366758 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [19] valid_0's binary_logloss: 0.364861 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [20] valid_0's binary_logloss: 0.362969 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [21] valid_0's binary_logloss: 0.359157 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [22] valid_0's binary_logloss: 0.355634 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [23] valid_0's binary_logloss: 0.354949 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [24] valid_0's binary_logloss: 0.354996 Early stopping, best iteration is: [23] valid_0's binary_logloss: 0.354949 [LightGBM] [Warning] Auto-choosing row-wise multi-threading, the overhead of testing was 0.000574 seconds. You can set `force_row_wise=true` to remove the overhead. And if memory is not enough, you can set `force_col_wise=true`.
[I 2022-03-01 17:31:22,036] Trial 16 finished with value: 0.4617868675995694 and parameters: {'neighbors': 7, 'num_leaves': 72, 'feature_fraction': 0.4513653375181522, 'bagging_fraction': 0.7401566977439626, 'bagging_freq': 3, 'min_child_samples': 29}. Best is trial 0 with value: 0.4617868675995694.
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf [1] valid_0's binary_logloss: 0.404145 Training until validation scores don't improve for 1 rounds [2] valid_0's binary_logloss: 0.402693 [3] valid_0's binary_logloss: 0.401998 [4] valid_0's binary_logloss: 0.401303 [5] valid_0's binary_logloss: 0.397926 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [6] valid_0's binary_logloss: 0.392647 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [7] valid_0's binary_logloss: 0.392005 [8] valid_0's binary_logloss: 0.389841 [9] valid_0's binary_logloss: 0.387923 [10] valid_0's binary_logloss: 0.387305 [11] valid_0's binary_logloss: 0.385747 [12] valid_0's binary_logloss: 0.381345 [13] valid_0's binary_logloss: 0.379626 [14] valid_0's binary_logloss: 0.377357 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [15] valid_0's binary_logloss: 0.378302 Early stopping, best iteration is: [14] valid_0's binary_logloss: 0.377357 [LightGBM] [Warning] Auto-choosing row-wise multi-threading, the overhead of testing was 0.000349 seconds. You can set `force_row_wise=true` to remove the overhead. And if memory is not enough, you can set `force_col_wise=true`. 
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf [1] valid_0's binary_logloss: 0.405595 Training until validation scores don't improve for 1 rounds [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [2] valid_0's binary_logloss: 0.403403 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [3] valid_0's binary_logloss: 0.399244 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [4] valid_0's binary_logloss: 0.397222 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [5] valid_0's binary_logloss: 0.394199 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [6] valid_0's binary_logloss: 0.392805 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [7] valid_0's binary_logloss: 0.392152 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [8] valid_0's binary_logloss: 0.388095 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [9] valid_0's binary_logloss: 0.386264 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [10] valid_0's binary_logloss: 0.384842 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [11] valid_0's binary_logloss: 0.383434 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [12] valid_0's binary_logloss: 0.382759 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[I 2022-03-01 17:31:22,200] Trial 17 finished with value: 0.4617868675995694 and parameters: {'neighbors': 5, 'num_leaves': 127, 'feature_fraction': 0.5360231897918603, 'bagging_fraction': 0.5082551062822289, 'bagging_freq': 5, 'min_child_samples': 94}. Best is trial 0 with value: 0.4617868675995694.
[LightGBM] training run (continued; early stopping after 1 round without improvement; repeated "[Warning] No further splits with positive gain, best gain: -inf" messages omitted): valid_0's binary_logloss fell from 0.380151 [13] to 0.359786 at best iteration [26].
[LightGBM] training run: valid_0's binary_logloss fell from 0.406282 [1] to 0.360815 [20] (run continues below).
[I 2022-03-01 17:31:22,382] Trial 18 finished with value: 0.4617868675995694 and parameters: {'neighbors': 3, 'num_leaves': 218, 'feature_fraction': 0.5805507735476287, 'bagging_fraction': 0.6214645851284905, 'bagging_freq': 1, 'min_child_samples': 56}. Best is trial 0 with value: 0.4617868675995694.
[I 2022-03-01 17:31:22,556] Trial 19 finished with value: 0.3081625314635023 and parameters: {'neighbors': 10, 'num_leaves': 69, 'feature_fraction': 0.7003777608537448, 'bagging_fraction': 0.876787303856469, 'bagging_freq': 4, 'min_child_samples': 5}. Best is trial 0 with value: 0.4617868675995694.
[LightGBM] training run (continued): valid_0's binary_logloss reached 0.35429 at best iteration [22].
[LightGBM] training run: valid_0's binary_logloss fell from 0.39564 [1] to 0.346047 at best iteration [13].
[I 2022-03-01 17:31:25,652] A new study created in memory with name: no-name-7ab8202b-81e0-48a1-b9a0-e01266c04cae
21 26 123518
[I 2022-03-01 17:31:25,953] Trial 0 finished with value: 0.4626544868350349 and parameters: {'neighbors': 2, 'num_leaves': 115, 'feature_fraction': 0.8414673863714155, 'bagging_fraction': 0.5192146843669733, 'bagging_freq': 7, 'min_child_samples': 69}. Best is trial 0 with value: 0.4626544868350349.
[I 2022-03-01 17:31:26,094] Trial 1 finished with value: 0.4626544868350349 and parameters: {'neighbors': 5, 'num_leaves': 49, 'feature_fraction': 0.5018727582699055, 'bagging_fraction': 0.44950638361575196, 'bagging_freq': 6, 'min_child_samples': 82}. Best is trial 0 with value: 0.4626544868350349.
[LightGBM] training run (early stopping after 1 round without improvement; repeated warnings omitted): valid_0's binary_logloss fell from 0.398548 [1] to 0.38666 at best iteration [8].
[LightGBM] training run: valid_0's binary_logloss fell from 0.401736 [1] to 0.394207 at best iteration [7].
[I 2022-03-01 17:31:26,249] Trial 2 finished with value: 0.30824372759856633 and parameters: {'neighbors': 3, 'num_leaves': 71, 'feature_fraction': 0.8798014527090778, 'bagging_fraction': 0.8095049864051115, 'bagging_freq': 5, 'min_child_samples': 35}. Best is trial 0 with value: 0.4626544868350349.
[I 2022-03-01 17:31:26,392] Trial 3 finished with value: 0.4626544868350349 and parameters: {'neighbors': 3, 'num_leaves': 42, 'feature_fraction': 0.7228828879540345, 'bagging_fraction': 0.723006658781091, 'bagging_freq': 6, 'min_child_samples': 9}. Best is trial 0 with value: 0.4626544868350349.
[LightGBM] training run: valid_0's binary_logloss fell from 0.393802 [1] to 0.376637 at best iteration [10].
[LightGBM] training run: valid_0's binary_logloss fell from 0.398792 [1] to 0.384997 at best iteration [6].
[I 2022-03-01 17:31:26,527] Trial 4 finished with value: 0.4626544868350349 and parameters: {'neighbors': 3, 'num_leaves': 122, 'feature_fraction': 0.604696272937773, 'bagging_fraction': 0.8672451107516373, 'bagging_freq': 1, 'min_child_samples': 100}. Best is trial 0 with value: 0.4626544868350349.
[I 2022-03-01 17:31:26,681] Trial 5 finished with value: 0.4626544868350349 and parameters: {'neighbors': 10, 'num_leaves': 219, 'feature_fraction': 0.9365167998999511, 'bagging_fraction': 0.7833521711366491, 'bagging_freq': 6, 'min_child_samples': 58}. Best is trial 0 with value: 0.4626544868350349.
[LightGBM] training run: valid_0's binary_logloss fell from 0.402508 [1] to 0.398867 at best iteration [4].
[LightGBM] training run: valid_0's binary_logloss fell from 0.399183 [1] to 0.37758 at best iteration [8].
[I 2022-03-01 17:31:26,840] Trial 6 finished with value: 0.4626544868350349 and parameters: {'neighbors': 19, 'num_leaves': 143, 'feature_fraction': 0.9241649622159405, 'bagging_fraction': 0.929782584717493, 'bagging_freq': 2, 'min_child_samples': 47}. Best is trial 0 with value: 0.4626544868350349.
[I 2022-03-01 17:31:26,984] Trial 7 finished with value: 0.4626544868350349 and parameters: {'neighbors': 5, 'num_leaves': 79, 'feature_fraction': 0.5668904498633303, 'bagging_fraction': 0.6687817263192997, 'bagging_freq': 3, 'min_child_samples': 58}. Best is trial 0 with value: 0.4626544868350349.
[LightGBM] training run: valid_0's binary_logloss fell from 0.397668 [1] to 0.377435 at best iteration [9].
[LightGBM] training run: valid_0's binary_logloss fell from 0.402831 [1] to 0.40022 at best iteration [7].
[I 2022-03-01 17:31:27,120] Trial 8 finished with value: 0.4626544868350349 and parameters: {'neighbors': 2, 'num_leaves': 22, 'feature_fraction': 0.731067357016334, 'bagging_fraction': 0.5915805394005955, 'bagging_freq': 3, 'min_child_samples': 6}. Best is trial 0 with value: 0.4626544868350349.
[I 2022-03-01 17:31:27,273] Trial 9 finished with value: 0.4626544868350349 and parameters: {'neighbors': 16, 'num_leaves': 111, 'feature_fraction': 0.7432291451777941, 'bagging_fraction': 0.9006685283709022, 'bagging_freq': 3, 'min_child_samples': 39}. Best is trial 0 with value: 0.4626544868350349.
[LightGBM] training run: valid_0's binary_logloss fell from 0.399229 [1] to 0.384961 at best iteration [6].
[LightGBM] training run: valid_0's binary_logloss fell from 0.398969 [1] to 0.384959 at best iteration [9].
[I 2022-03-01 17:31:27,425] Trial 10 finished with value: 0.4626544868350349 and parameters: {'neighbors': 2, 'num_leaves': 186, 'feature_fraction': 0.40657188246076853, 'bagging_fraction': 0.4608468783229307, 'bagging_freq': 7, 'min_child_samples': 77}. Best is trial 0 with value: 0.4626544868350349.
[I 2022-03-01 17:31:27,572] Trial 11 finished with value: 0.4626544868350349 and parameters: {'neighbors': 6, 'num_leaves': 3, 'feature_fraction': 0.5130194020840952, 'bagging_fraction': 0.40920800020206244, 'bagging_freq': 7, 'min_child_samples': 84}. Best is trial 0 with value: 0.4626544868350349.
[LightGBM] training run: valid_0's binary_logloss fell from 0.403307 [1] to 0.400591 at best iteration [8].
[LightGBM] training run: valid_0's binary_logloss fell from 0.402604 [1] to 0.400779 at best iteration [7].
[I 2022-03-01 17:31:27,721] Trial 12 finished with value: 0.4626544868350349 and parameters: {'neighbors': 6, 'num_leaves': 160, 'feature_fraction': 0.8216882668558992, 'bagging_fraction': 0.5243249164567009, 'bagging_freq': 5, 'min_child_samples': 75}. Best is trial 0 with value: 0.4626544868350349.
[I 2022-03-01 17:31:27,878] Trial 13 finished with value: 0.4626544868350349 and parameters: {'neighbors': 9, 'num_leaves': 81, 'feature_fraction': 0.4126617382083356, 'bagging_fraction': 0.5645123153806042, 'bagging_freq': 6, 'min_child_samples': 92}. Best is trial 0 with value: 0.4626544868350349.
[LightGBM] training run: valid_0's binary_logloss fell from 0.39867 [1] to 0.385629 at best iteration [8].
[LightGBM] training run: valid_0's binary_logloss fell from 0.402797 [1] to 0.398641 at best iteration [7].
[I 2022-03-01 17:31:28,028] Trial 14 finished with value: 0.4626544868350349 and parameters: {'neighbors': 4, 'num_leaves': 54, 'feature_fraction': 0.9998390120653952, 'bagging_fraction': 0.4841070472683704, 'bagging_freq': 7, 'min_child_samples': 68}. Best is trial 0 with value: 0.4626544868350349.
[I 2022-03-01 17:31:28,188] Trial 15 finished with value: 0.4626544868350349 and parameters: {'neighbors': 9, 'num_leaves': 104, 'feature_fraction': 0.6380053495593385, 'bagging_fraction': 0.6396455383525539, 'bagging_freq': 5, 'min_child_samples': 67}. Best is trial 0 with value: 0.4626544868350349.
[LightGBM] training run: valid_0's binary_logloss fell from 0.392933 [1] to 0.378508 at best iteration [4].
[LightGBM] training run: valid_0's binary_logloss fell from 0.402307 [1] to 0.397687 at best iteration [8].
[I 2022-03-01 17:31:28,336] Trial 16 finished with value: 0.4626544868350349 and parameters: {'neighbors': 2, 'num_leaves': 244, 'feature_fraction': 0.4796700598083577, 'bagging_fraction': 0.42626065829804644, 'bagging_freq': 4, 'min_child_samples': 84}. Best is trial 0 with value: 0.4626544868350349.
[I 2022-03-01 17:31:28,498] Trial 17 finished with value: 0.4626544868350349 and parameters: {'neighbors': 4, 'num_leaves': 169, 'feature_fraction': 0.8421958854153012, 'bagging_fraction': 0.5127076127065336, 'bagging_freq': 7, 'min_child_samples': 23}. Best is trial 0 with value: 0.4626544868350349.
[LightGBM] training run: valid_0's binary_logloss fell from 0.401918 [1] to 0.399757 at best iteration [3].
[LightGBM] training run: valid_0's binary_logloss fell from 0.397838 [1] to 0.38177 at best iteration [6].
[I 2022-03-01 17:31:28,652] Trial 18 finished with value: 0.4626544868350349 and parameters: {'neighbors': 10, 'num_leaves': 104, 'feature_fraction': 0.64791326997118, 'bagging_fraction': 0.6481144315632897, 'bagging_freq': 5, 'min_child_samples': 64}. Best is trial 0 with value: 0.4626544868350349.
[I 2022-03-01 17:31:28,803] Trial 19 finished with value: 0.4626544868350349 and parameters: {'neighbors': 2, 'num_leaves': 256, 'feature_fraction': 0.8106512151971612, 'bagging_fraction': 0.4070022398063616, 'bagging_freq': 4, 'min_child_samples': 90}. Best is trial 0 with value: 0.4626544868350349.
[LightGBM] training run: early stopping at iteration [2]; best iteration is [1] with valid_0's binary_logloss: 0.402352.
[LightGBM] training run: valid_0's binary_logloss fell from 0.402656 [1] to 0.391413 at best iteration [10].
[I 2022-03-01 17:31:29,552] A new study created in memory with name: no-name-ac850d54-ef38-47c6-8f2f-eb3dcb918362
26 31 67138
[I 2022-03-01 17:31:29,814] Trial 0 finished with value: 0.4692144373673036 and parameters: {'neighbors': 2, 'num_leaves': 130, 'feature_fraction': 0.7006697050976558, 'bagging_fraction': 0.9226721145541669, 'bagging_freq': 5, 'min_child_samples': 98}. Best is trial 0 with value: 0.4692144373673036. [I 2022-03-01 17:31:29,967] Trial 1 finished with value: 0.4692144373673036 and parameters: {'neighbors': 17, 'num_leaves': 47, 'feature_fraction': 0.7094729755828693, 'bagging_fraction': 0.4779397181801662, 'bagging_freq': 1, 'min_child_samples': 33}. Best is trial 0 with value: 0.4692144373673036.
[LightGBM] [Warning] Auto-choosing row-wise multi-threading, the overhead of testing was 0.000417 seconds. You can set `force_row_wise=true` to remove the overhead. And if memory is not enough, you can set `force_col_wise=true`. [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [1] valid_0's binary_logloss: 0.357308 Training until validation scores don't improve for 1 rounds [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [2] valid_0's binary_logloss: 0.356254 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [3] valid_0's binary_logloss: 0.355479 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [4] valid_0's binary_logloss: 0.354679 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [5] valid_0's binary_logloss: 0.355036 Early stopping, best iteration is: [4] valid_0's binary_logloss: 0.354679 [LightGBM] [Warning] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000529 seconds. You can set `force_col_wise=true` to remove the overhead. 
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf [1] valid_0's binary_logloss: 0.357707 Training until validation scores don't improve for 1 rounds [2] valid_0's binary_logloss: 0.356623 [3] valid_0's binary_logloss: 0.355379 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [4] valid_0's binary_logloss: 0.35434 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [5] valid_0's binary_logloss: 0.353753 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [6] valid_0's binary_logloss: 0.353493 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [7] valid_0's binary_logloss: 0.353367 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [8] valid_0's binary_logloss: 0.350838 [9] valid_0's binary_logloss: 0.351934 Early stopping, best iteration is: [8] valid_0's binary_logloss: 0.350838
[I 2022-03-01 17:31:30,116] Trial 2 finished with value: 0.4692144373673036 and parameters: {'neighbors': 17, 'num_leaves': 51, 'feature_fraction': 0.4895587288135552, 'bagging_fraction': 0.9376484143760243, 'bagging_freq': 1, 'min_child_samples': 71}. Best is trial 0 with value: 0.4692144373673036. [I 2022-03-01 17:31:30,261] Trial 3 finished with value: 0.4692144373673036 and parameters: {'neighbors': 13, 'num_leaves': 246, 'feature_fraction': 0.5225203300327946, 'bagging_fraction': 0.9517394337334998, 'bagging_freq': 1, 'min_child_samples': 56}. Best is trial 0 with value: 0.4692144373673036.
[LightGBM] [Warning] Auto-choosing row-wise multi-threading, the overhead of testing was 0.000420 seconds. You can set `force_row_wise=true` to remove the overhead. And if memory is not enough, you can set `force_col_wise=true`. [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [1] valid_0's binary_logloss: 0.358645 Training until validation scores don't improve for 1 rounds [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [2] valid_0's binary_logloss: 0.357727 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [3] valid_0's binary_logloss: 0.356651 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [4] valid_0's binary_logloss: 0.356218 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [5] valid_0's binary_logloss: 0.355437 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [6] valid_0's binary_logloss: 0.355537 Early stopping, best iteration is: [5] valid_0's binary_logloss: 0.355437 [LightGBM] [Warning] Auto-choosing row-wise multi-threading, the overhead of testing was 0.000384 seconds. You can set `force_row_wise=true` to remove the overhead. And if memory is not enough, you can set `force_col_wise=true`. [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [1] valid_0's binary_logloss: 0.35907 Training until validation scores don't improve for 1 rounds [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [2] valid_0's binary_logloss: 0.358096 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [3] valid_0's binary_logloss: 0.357882 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [4] valid_0's binary_logloss: 0.35809 Early stopping, best iteration is: [3] valid_0's binary_logloss: 0.357882
[I 2022-03-01 17:31:30,392] Trial 4 finished with value: 0.4692144373673036 and parameters: {'neighbors': 2, 'num_leaves': 153, 'feature_fraction': 0.8941470632979704, 'bagging_fraction': 0.5442194754838728, 'bagging_freq': 2, 'min_child_samples': 63}. Best is trial 0 with value: 0.4692144373673036. [I 2022-03-01 17:31:30,526] Trial 5 finished with value: 0.4692144373673036 and parameters: {'neighbors': 5, 'num_leaves': 3, 'feature_fraction': 0.6036038806706959, 'bagging_fraction': 0.5976836393299501, 'bagging_freq': 1, 'min_child_samples': 62}. Best is trial 0 with value: 0.4692144373673036.
[LightGBM] [Warning] Auto-choosing row-wise multi-threading, the overhead of testing was 0.000662 seconds. You can set `force_row_wise=true` to remove the overhead. And if memory is not enough, you can set `force_col_wise=true`. [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [1] valid_0's binary_logloss: 0.357133 Training until validation scores don't improve for 1 rounds [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [2] valid_0's binary_logloss: 0.356223 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [3] valid_0's binary_logloss: 0.356639 Early stopping, best iteration is: [2] valid_0's binary_logloss: 0.356223 [LightGBM] [Warning] Auto-choosing row-wise multi-threading, the overhead of testing was 0.000452 seconds. You can set `force_row_wise=true` to remove the overhead. And if memory is not enough, you can set `force_col_wise=true`. [1] valid_0's binary_logloss: 0.357611 Training until validation scores don't improve for 1 rounds [2] valid_0's binary_logloss: 0.356349 [3] valid_0's binary_logloss: 0.355683 [4] valid_0's binary_logloss: 0.355314 [5] valid_0's binary_logloss: 0.354999 [6] valid_0's binary_logloss: 0.355238 Early stopping, best iteration is: [5] valid_0's binary_logloss: 0.354999
[I 2022-03-01 17:31:30,647] Trial 6 finished with value: 0.4692144373673036 and parameters: {'neighbors': 2, 'num_leaves': 3, 'feature_fraction': 0.8813488878524807, 'bagging_fraction': 0.8660724105400621, 'bagging_freq': 6, 'min_child_samples': 68}. Best is trial 0 with value: 0.4692144373673036. [I 2022-03-01 17:31:30,797] Trial 7 finished with value: 0.4692144373673036 and parameters: {'neighbors': 4, 'num_leaves': 136, 'feature_fraction': 0.5064260466535558, 'bagging_fraction': 0.5432796980361574, 'bagging_freq': 4, 'min_child_samples': 7}. Best is trial 0 with value: 0.4692144373673036.
[LightGBM] [Warning] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000533 seconds. You can set `force_col_wise=true` to remove the overhead. [1] valid_0's binary_logloss: 0.357964 Training until validation scores don't improve for 1 rounds [2] valid_0's binary_logloss: 0.357379 [3] valid_0's binary_logloss: 0.357093 [4] valid_0's binary_logloss: 0.356251 [5] valid_0's binary_logloss: 0.356518 Early stopping, best iteration is: [4] valid_0's binary_logloss: 0.356251 [LightGBM] [Warning] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000741 seconds. You can set `force_col_wise=true` to remove the overhead. [1] valid_0's binary_logloss: 0.356712 Training until validation scores don't improve for 1 rounds [2] valid_0's binary_logloss: 0.355337 [3] valid_0's binary_logloss: 0.353758 [4] valid_0's binary_logloss: 0.353989 Early stopping, best iteration is: [3] valid_0's binary_logloss: 0.353758
[I 2022-03-01 17:31:30,938] Trial 8 finished with value: 0.4692144373673036 and parameters: {'neighbors': 4, 'num_leaves': 252, 'feature_fraction': 0.6437807845988475, 'bagging_fraction': 0.787992045183467, 'bagging_freq': 1, 'min_child_samples': 39}. Best is trial 0 with value: 0.4692144373673036. [I 2022-03-01 17:31:31,084] Trial 9 finished with value: 0.4692144373673036 and parameters: {'neighbors': 14, 'num_leaves': 59, 'feature_fraction': 0.5249499206279414, 'bagging_fraction': 0.7755555180998641, 'bagging_freq': 5, 'min_child_samples': 69}. Best is trial 0 with value: 0.4692144373673036.
[LightGBM] [Warning] Auto-choosing row-wise multi-threading, the overhead of testing was 0.000445 seconds. You can set `force_row_wise=true` to remove the overhead. And if memory is not enough, you can set `force_col_wise=true`. [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [1] valid_0's binary_logloss: 0.357914 Training until validation scores don't improve for 1 rounds [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [2] valid_0's binary_logloss: 0.357806 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [3] valid_0's binary_logloss: 0.35632 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [4] valid_0's binary_logloss: 0.356162 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [5] valid_0's binary_logloss: 0.356814 Early stopping, best iteration is: [4] valid_0's binary_logloss: 0.356162 [LightGBM] [Warning] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000617 seconds. You can set `force_col_wise=true` to remove the overhead. [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [1] valid_0's binary_logloss: 0.35671 Training until validation scores don't improve for 1 rounds [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [2] valid_0's binary_logloss: 0.355111 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [3] valid_0's binary_logloss: 0.354498 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [4] valid_0's binary_logloss: 0.354386 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [5] valid_0's binary_logloss: 0.354495 Early stopping, best iteration is: [4] valid_0's binary_logloss: 0.354386
[I 2022-03-01 17:31:31,233] Trial 10 finished with value: 0.4692144373673036 and parameters: {'neighbors': 7, 'num_leaves': 181, 'feature_fraction': 0.7868510573123638, 'bagging_fraction': 0.677556558514678, 'bagging_freq': 7, 'min_child_samples': 100}. Best is trial 0 with value: 0.4692144373673036. [I 2022-03-01 17:31:31,396] Trial 11 finished with value: 0.4692144373673036 and parameters: {'neighbors': 8, 'num_leaves': 90, 'feature_fraction': 0.7701903809804411, 'bagging_fraction': 0.41720321906851787, 'bagging_freq': 3, 'min_child_samples': 26}. Best is trial 0 with value: 0.4692144373673036.
[LightGBM] [Warning] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000715 seconds. You can set `force_col_wise=true` to remove the overhead. [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [1] valid_0's binary_logloss: 0.357088 Training until validation scores don't improve for 1 rounds [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [2] valid_0's binary_logloss: 0.3558 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [3] valid_0's binary_logloss: 0.355394 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [4] valid_0's binary_logloss: 0.355519 Early stopping, best iteration is: [3] valid_0's binary_logloss: 0.355394 [LightGBM] [Warning] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000664 seconds. You can set `force_col_wise=true` to remove the overhead. [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [1] valid_0's binary_logloss: 0.356765 Training until validation scores don't improve for 1 rounds [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [2] valid_0's binary_logloss: 0.355708 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [3] valid_0's binary_logloss: 0.3535 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [4] valid_0's binary_logloss: 0.353126 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [5] valid_0's binary_logloss: 0.352673 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [6] valid_0's binary_logloss: 0.352428 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [7] valid_0's binary_logloss: 0.352997 Early stopping, best iteration is: [6] valid_0's binary_logloss: 0.352428
[I 2022-03-01 17:31:31,541] Trial 12 finished with value: 0.4692144373673036 and parameters: {'neighbors': 3, 'num_leaves': 98, 'feature_fraction': 0.7262431818277003, 'bagging_fraction': 0.45499924315469736, 'bagging_freq': 4, 'min_child_samples': 93}. Best is trial 0 with value: 0.4692144373673036. [I 2022-03-01 17:31:31,689] Trial 13 finished with value: 0.4692144373673036 and parameters: {'neighbors': 9, 'num_leaves': 181, 'feature_fraction': 0.649587161744817, 'bagging_fraction': 0.6679054830325064, 'bagging_freq': 5, 'min_child_samples': 26}. Best is trial 0 with value: 0.4692144373673036.
[LightGBM] [Warning] Auto-choosing row-wise multi-threading, the overhead of testing was 0.000424 seconds. You can set `force_row_wise=true` to remove the overhead. And if memory is not enough, you can set `force_col_wise=true`. [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [1] valid_0's binary_logloss: 0.357293 Training until validation scores don't improve for 1 rounds [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [2] valid_0's binary_logloss: 0.35655 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [3] valid_0's binary_logloss: 0.355728 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [4] valid_0's binary_logloss: 0.354892 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [5] valid_0's binary_logloss: 0.354418 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [6] valid_0's binary_logloss: 0.353687 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [7] valid_0's binary_logloss: 0.35378 Early stopping, best iteration is: [6] valid_0's binary_logloss: 0.353687 [LightGBM] [Warning] Auto-choosing row-wise multi-threading, the overhead of testing was 0.000439 seconds. You can set `force_row_wise=true` to remove the overhead. And if memory is not enough, you can set `force_col_wise=true`. [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [1] valid_0's binary_logloss: 0.357841 Training until validation scores don't improve for 1 rounds [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [2] valid_0's binary_logloss: 0.358907 Early stopping, best iteration is: [1] valid_0's binary_logloss: 0.357841
[I 2022-03-01 17:31:31,835] Trial 14 finished with value: 0.4692144373673036 and parameters: {'neighbors': 3, 'num_leaves': 103, 'feature_fraction': 0.9757724187733865, 'bagging_fraction': 0.7997081036603675, 'bagging_freq': 5, 'min_child_samples': 85}. Best is trial 0 with value: 0.4692144373673036. [I 2022-03-01 17:31:31,991] Trial 15 finished with value: 0.4692144373673036 and parameters: {'neighbors': 11, 'num_leaves': 55, 'feature_fraction': 0.4096327690779591, 'bagging_fraction': 0.8778842802513084, 'bagging_freq': 3, 'min_child_samples': 42}. Best is trial 0 with value: 0.4692144373673036.
[LightGBM] [Warning] Auto-choosing row-wise multi-threading, the overhead of testing was 0.000460 seconds. You can set `force_row_wise=true` to remove the overhead. And if memory is not enough, you can set `force_col_wise=true`. [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [1] valid_0's binary_logloss: 0.356658 Training until validation scores don't improve for 1 rounds [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [2] valid_0's binary_logloss: 0.355168 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [3] valid_0's binary_logloss: 0.354675 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [4] valid_0's binary_logloss: 0.354714 Early stopping, best iteration is: [3] valid_0's binary_logloss: 0.354675 [LightGBM] [Warning] Auto-choosing row-wise multi-threading, the overhead of testing was 0.000383 seconds. You can set `force_row_wise=true` to remove the overhead. And if memory is not enough, you can set `force_col_wise=true`. [1] valid_0's binary_logloss: 0.356873 Training until validation scores don't improve for 1 rounds [2] valid_0's binary_logloss: 0.356283 [3] valid_0's binary_logloss: 0.355447 [4] valid_0's binary_logloss: 0.355099 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [5] valid_0's binary_logloss: 0.353942 [6] valid_0's binary_logloss: 0.353958 Early stopping, best iteration is: [5] valid_0's binary_logloss: 0.353942
[I 2022-03-01 17:31:32,148] Trial 16 finished with value: 0.4692144373673036 and parameters: {'neighbors': 20, 'num_leaves': 204, 'feature_fraction': 0.8278336451614028, 'bagging_fraction': 0.48923702143038994, 'bagging_freq': 7, 'min_child_samples': 43}. Best is trial 0 with value: 0.4692144373673036. [I 2022-03-01 17:31:32,297] Trial 17 finished with value: 0.4692144373673036 and parameters: {'neighbors': 3, 'num_leaves': 104, 'feature_fraction': 0.9972828306537637, 'bagging_fraction': 0.7945522457803047, 'bagging_freq': 5, 'min_child_samples': 83}. Best is trial 0 with value: 0.4692144373673036.
[LightGBM] [Warning] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000535 seconds. You can set `force_col_wise=true` to remove the overhead. [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [1] valid_0's binary_logloss: 0.357066 Training until validation scores don't improve for 1 rounds [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [2] valid_0's binary_logloss: 0.356461 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [3] valid_0's binary_logloss: 0.354186 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [4] valid_0's binary_logloss: 0.35469 Early stopping, best iteration is: [3] valid_0's binary_logloss: 0.354186 [LightGBM] [Warning] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000954 seconds. You can set `force_col_wise=true` to remove the overhead. [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [1] valid_0's binary_logloss: 0.357786 Training until validation scores don't improve for 1 rounds [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [2] valid_0's binary_logloss: 0.35647 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [3] valid_0's binary_logloss: 0.356155 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [4] valid_0's binary_logloss: 0.3557 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [5] valid_0's binary_logloss: 0.354951 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [6] valid_0's binary_logloss: 0.353345 [LightGBM] [Warning] No further splits with positive gain, best gain: -inf [7] valid_0's binary_logloss: 0.353608 Early stopping, best iteration is: [6] valid_0's binary_logloss: 0.353345
[I 2022-03-01 17:31:32,447] Trial 18 finished with value: 0.4692144373673036 and parameters: {'neighbors': 6, 'num_leaves': 137, 'feature_fraction': 0.422937544868833, 'bagging_fraction': 0.9874367393998925, 'bagging_freq': 3, 'min_child_samples': 49}. Best is trial 0 with value: 0.4692144373673036. [I 2022-03-01 17:31:32,597] Trial 19 finished with value: 0.4692144373673036 and parameters: {'neighbors': 2, 'num_leaves': 224, 'feature_fraction': 0.842156111855853, 'bagging_fraction': 0.6094454170717349, 'bagging_freq': 7, 'min_child_samples': 10}. Best is trial 0 with value: 0.4692144373673036.
[LightGBM] [Warning] Auto-choosing row-wise multi-threading, the overhead of testing was 0.000408 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[1] valid_0's binary_logloss: 0.356848
Training until validation scores don't improve for 1 rounds
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[2] valid_0's binary_logloss: 0.354756
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[3] valid_0's binary_logloss: 0.354898
Early stopping, best iteration is:
[2] valid_0's binary_logloss: 0.354756
[LightGBM] [Warning] Auto-choosing row-wise multi-threading, the overhead of testing was 0.000436 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[1] valid_0's binary_logloss: 0.35592
Training until validation scores don't improve for 1 rounds
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[2] valid_0's binary_logloss: 0.357961
Early stopping, best iteration is:
[1] valid_0's binary_logloss: 0.35592
FrozenTrial(number=0, values=[0.4692144373673036], datetime_start=datetime.datetime(2022, 3, 1, 17, 31, 29, 736218), datetime_complete=datetime.datetime(2022, 3, 1, 17, 31, 29, 813218), params={'neighbors': 2, 'num_leaves': 130, 'feature_fraction': 0.7006697050976558, 'bagging_fraction': 0.9226721145541669, 'bagging_freq': 5, 'min_child_samples': 98}, distributions={'neighbors': IntLogUniformDistribution(high=20, low=2, step=1), 'num_leaves': IntUniformDistribution(high=256, low=2, step=1), 'feature_fraction': UniformDistribution(high=1.0, low=0.4), 'bagging_fraction': UniformDistribution(high=1.0, low=0.4), 'bagging_freq': IntUniformDistribution(high=7, low=1, step=1), 'min_child_samples': IntUniformDistribution(high=100, low=5, step=1)}, user_attrs={}, system_attrs={}, intermediate_values={}, trial_id=0, state=TrialState.COMPLETE, value=None)
Below are the graphs produced during optimization. Because a separate model is optimized for each age group, you will see one set of graphs per group.
for fig in plots:
    fig.show()
Next, I train the models with the optimized parameters and then predict the labels.
print(best_models)
[[{'neighbors': 3, 'num_leaves': 173, 'feature_fraction': 0.5186323020532635, 'bagging_fraction': 0.6163552066220521, 'bagging_freq': 6, 'min_child_samples': 31}, 16, 5, None], [{'neighbors': 2, 'num_leaves': 115, 'feature_fraction': 0.8414673863714155, 'bagging_fraction': 0.5192146843669733, 'bagging_freq': 7, 'min_child_samples': 69}, 21, 5, None], [{'neighbors': 2, 'num_leaves': 130, 'feature_fraction': 0.7006697050976558, 'bagging_fraction': 0.9226721145541669, 'bagging_freq': 5, 'min_child_samples': 98}, 26, 5, None]]
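Each `best_models` entry appears to be a four-element list: the tuned hyper-parameters, the age group's lower bound, the group's width, and an unused placeholder; the training loop below computes the upper bound as lower bound plus width. A small sketch of unpacking one entry (the helper name is mine, not the project's):

```python
# Hypothetical helper; field meanings inferred from how the training loop
# consumes each entry (young_age = each[1], old_age = young_age + each[2]).
def unpack_entry(entry):
    params, young_age, span, _unused = entry
    return params, young_age, young_age + span

params, low, high = unpack_entry(
    [{'neighbors': 3, 'num_leaves': 173}, 16, 5, None]
)
# low is 16 and high is 21, matching the first age group in the output above.
```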
class train(object):
    def __init__(self, X, model_params):
        self.model_params = model_params
        # Columns 5 onward hold one purchase label per week; collect every
        # label so the encoder sees the full vocabulary.
        labels = []
        for each in range(5, len(X.columns)):
            labels.extend(X.iloc[:, each].tolist())
        labels = list(set(labels))
        encoder = LabelEncoder()
        fit = encoder.fit(labels)
        self.fit = fit
        # '1' is the filler token for "no purchase that week".
        self.null = fit.transform(['1'])
        # Encode each weekly column, renaming n_week -> n_weeks.
        i = 0
        for each in range(5, len(X.columns)):
            original_column = f'{i}_week'
            X.loc[:, (f'{i}_weeks')] = fit.transform(X[original_column]).astype(int)
            #X[f'{i}_weeks']=X[f'{i}_weeks'].replace(0, np.nan)
            X = X.drop(columns=[f'{i}_week'])
            i += 1
        #X[f'{i+1}_week']='1'
        # Training features: everything up to the second-to-last week;
        # target: the last week.
        N = len(X.columns)
        last_column = N - 1
        second_to_last_column = N - 2
        self.X = X.iloc[:, 0:second_to_last_column]
        self.y = X.iloc[:, last_column]
        # Build the prediction frame by sliding the window forward three
        # weeks: drop the three oldest weeks and renumber the rest.
        X = X.drop(columns=['0_weeks'])
        X = X.drop(columns=['1_weeks'])
        X = X.drop(columns=['2_weeks'])
        d = 3
        i = 3
        for each in range(5, len(X.columns)):
            original_column = f'{i}_weeks'
            new_column = f'{i-d}_weeks'
            X[new_column] = X[original_column].astype(int)
            X = X.drop(columns=[original_column])
            i += 1
        print(self.null)
        # The unknown upcoming week starts out as the "no purchase" token.
        X[f'{i-d}_weeks'] = self.null[0]
        N = len(X.columns)
        last_column = N
        self.X_pred = X.iloc[:, 0:last_column]

    def knn(self):
        neigh = KNeighborsClassifier(n_neighbors=self.model_params['neighbors'])
        neigh.fit(self.X, self.y)
        neighbors = pd.DataFrame(neigh.kneighbors(return_distance=False))
        prediction = pd.DataFrame(neigh.predict(self.X))
        return neigh, neighbors, prediction

    def data(self, X_data, neighbors, prediction):
        X = X_data.join(neighbors)
        # Note: this second join starts again from X_data, so only the KNN
        # prediction (not the neighbor indices) ends up in the feature set.
        X = X_data.join(prediction)
        self.X = X
        y = self.y
        # Binarize the target: 1 = no purchase next week, -1 = purchase.
        y[self.y == self.null[0]] = 1
        y[self.y != 1] = -1
        return X, y

    def lightgbm(self, neighbors, prediction):
        X, y = self.data(self.X, neighbors, prediction)
        dtrain = lgb.Dataset(X, label=y)
        param = {
            "objective": "binary",
            "metric": "binary_logloss",
            "verbosity": 1,
            "boosting_type": "gbdt",
            "num_leaves": self.model_params['num_leaves'],
            "feature_fraction": self.model_params['feature_fraction'],
            "bagging_fraction": self.model_params['bagging_fraction'],
            "bagging_freq": self.model_params['bagging_freq'],
            "min_child_samples": self.model_params['min_child_samples']
        }
        gbm = lgb.train(param, dtrain)
        return gbm

    def train(self):
        model_knn, neighbors, prediction = self.knn()
        model_lgb = self.lightgbm(neighbors, prediction)
        return model_lgb, model_knn

    def predict(self, age_study, model_lgb, model_knn):
        neighbors = pd.DataFrame(model_knn.kneighbors(return_distance=False))
        y_pred_knn = pd.DataFrame(model_knn.predict(self.X_pred))
        X_pred = self.data(self.X_pred, neighbors, y_pred_knn)
        y_pred_lgb = pd.DataFrame(model_lgb.predict(X_pred[0]))
        return y_pred_knn.values.tolist(), y_pred_lgb.values.tolist(), self.fit, X_pred[0]['customer_id']

if __name__ == "__main__":
    predictions = []
    y_pred = []
    r = []
    customer_ids = []
    for each in best_models:
        print(each)
        young_age = each[1]
        old_age = young_age + each[2]
        print(old_age)
        age_study = X.loc[(X['age'] >= young_age) & (X['age'] <= old_age)]
        t = train(age_study, each[0])
        model_lgb, model_knn = t.train()
        y_pred_knn, y_pred_lgb, encoder, customer_id = t.predict(age_study, model_lgb, model_knn)
        i = 0
        print(f'age_length {len(age_study)}')
        print(f'knn_ prediction {len(y_pred_knn)}')
        print(f'lgb_ prediction {len(y_pred_lgb)}')
        print(f'customer_id {len(customer_id)}')
        print(age_study.columns)
        customer_id = customer_id.to_list()
        # LightGBM gates the result: 1 means "no purchase", so emit the
        # filler token; otherwise emit the decoded KNN article prediction.
        while i < len(y_pred_knn):
            k = round(y_pred_lgb[i][0])
            m = encoder.inverse_transform([round(y_pred_knn[i][0])])
            if k == 1:
                r.append('1')
            else:
                r.append(str(m[0]))
            i += 1
        customer_ids.extend(customer_id)
[{'neighbors': 3, 'num_leaves': 173, 'feature_fraction': 0.5186323020532635, 'bagging_fraction': 0.6163552066220521, 'bagging_freq': 6, 'min_child_samples': 31}, 16, 5, None]
21
d:\soprisanalytics\kaggle\dtsa-5509-supervised-learning-final-project\venv\lib\site-packages\pandas\core\indexing.py:1667: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
[0]
[LightGBM] [Info] Number of positive: 56415, number of negative: 9151
[LightGBM] [Warning] Auto-choosing row-wise multi-threading, the overhead of testing was 0.002873 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 2618
[LightGBM] [Info] Number of data points in the train set: 65566, number of used features: 14
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.860431 -> initscore=1.818872
[LightGBM] [Info] Start training from score 1.818872
age_length 65566
knn_ prediction 65566
lgb_ prediction 65566
customer_id 65566
Index(['postal_codes', 'fashion_news_frequency', 'club_member_status',
'customer_id', 'age', '0_week', '1_week', '2_week', '3_week', '4_week',
'5_week', '6_week', '7_week', '8_week', '9_week', '0_weeks'],
dtype='object')
[{'neighbors': 2, 'num_leaves': 115, 'feature_fraction': 0.8414673863714155, 'bagging_fraction': 0.5192146843669733, 'bagging_freq': 7, 'min_child_samples': 69}, 21, 5, None]
26
[0]
[LightGBM] [Info] Number of positive: 104186, number of negative: 19332
[LightGBM] [Warning] Auto-choosing row-wise multi-threading, the overhead of testing was 0.005148 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 2820
[LightGBM] [Info] Number of data points in the train set: 123518, number of used features: 14
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.843488 -> initscore=1.684416
[LightGBM] [Info] Start training from score 1.684416
age_length 123518
knn_ prediction 123518
lgb_ prediction 123518
customer_id 123518
Index(['postal_codes', 'fashion_news_frequency', 'club_member_status',
'customer_id', 'age', '0_week', '1_week', '2_week', '3_week', '4_week',
'5_week', '6_week', '7_week', '8_week', '9_week', '0_weeks'],
dtype='object')
[{'neighbors': 2, 'num_leaves': 130, 'feature_fraction': 0.7006697050976558, 'bagging_fraction': 0.9226721145541669, 'bagging_freq': 5, 'min_child_samples': 98}, 26, 5, None]
31
[0]
[LightGBM] [Info] Number of positive: 56660, number of negative: 10478
[LightGBM] [Warning] Auto-choosing row-wise multi-threading, the overhead of testing was 0.002935 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 2710
[LightGBM] [Info] Number of data points in the train set: 67138, number of used features: 14
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.843933 -> initscore=1.687791
[LightGBM] [Info] Start training from score 1.687791
age_length 67138
knn_ prediction 67138
lgb_ prediction 67138
customer_id 67138
Index(['postal_codes', 'fashion_news_frequency', 'club_member_status',
'customer_id', 'age', '0_week', '1_week', '2_week', '3_week', '4_week',
'5_week', '6_week', '7_week', '8_week', '9_week', '0_weeks'],
dtype='object')
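The gating logic in the loop above, where the LightGBM score decides buy versus no-buy and only predicted buyers receive a KNN article, can be isolated as a small pure function. This is a sketch with illustrative names, not the project's code:

```python
def combine_predictions(p_no_buy, knn_labels, no_buy_token='1'):
    """Round each LightGBM score; a rounded 1 means "no purchase next
    week", so emit the filler token, otherwise fall through to the
    decoded KNN article label."""
    results = []
    for score, label in zip(p_no_buy, knn_labels):
        if round(score) == 1:
            results.append(no_buy_token)
        else:
            results.append(str(label))
    return results
```

Here `p_no_buy` plays the role of the flattened `y_pred_lgb` values and `knn_labels` the decoded `y_pred_knn` values.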
print(len(customer_ids))
print(len(r))
d = {'Customer_Ids':customer_ids,'Prediction':r}
df = pd.DataFrame(d)
df['Customer_Ids'] = customer_id_encoding.inverse_transform(df['Customer_Ids'])
print(df)
256222
256222
Customer_Ids Prediction
0 588e32d81a3c5eda8439b88654cb2f7acb336c9e028ca7... 1
1 5374130b2d260391c22de34dd9237c9c5bcc320b1fec31... 1
2 7beee2baccfda3501868a56642b738bc52bcf1804d4612... 1
3 f0d47c078e1ae6a6c3552a8bd7c7c4226db650136b487a... 1
4 fd90cc5dabce374693d54684cc8dba552f2a40c9a2619f... 762846027
... ... ...
256217 5a97b2a5f70bc5f421a3f9a41067985a2c9dd1fa6545da... 1
256218 3048faedf0a20e13a89e93e3e4d33b325ad9f7cd625933... 1
256219 6984408761c01ab4db16ecb7b4b934b6a5958853033bc9... 1
256220 672b61c9c5af04c8e38e98214433c6a9f4dd1386c2b5aa... 1
256221 c6cc3dc177527df77d43647414aa12938ebe140e9c77a4... 1
[256222 rows x 2 columns]
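As noted in the introduction, the complete submission will be produced by a rewritten Python script. A minimal sketch of the final write-out step, assuming the column names `customer_id` and `prediction` from the competition's sample submission (the function name is mine):

```python
import pandas as pd

def write_submission(df, path):
    # Rename to the assumed Kaggle column names and write without the
    # pandas index, since Kaggle rejects extra columns.
    out = df.rename(columns={"Customer_Ids": "customer_id",
                             "Prediction": "prediction"})
    out.to_csv(path, index=False)
    return out
```

In the productionized version this would run once per data segment and the per-segment files would be concatenated into one submission.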